Family relation extraction from Wikipedia by self-supervised learning
ZHU Suyang, HUI Haotian, QIAN Longhua, ZHANG Min
Journal of Computer Applications    2015, 35 (4): 1013-1016.   DOI: 10.11772/j.issn.1001-9081.2015.04.1013

Traditional supervised relation extraction requires large amounts of manually annotated training data, while semi-supervised learning suffers from low recall. A self-supervised learning approach was proposed to extract personal family relationships. First, semi-structured information (family relation triples) was mapped to the free text of Chinese Wikipedia to generate annotated training data automatically. Then family relations between person entities were extracted from Wikipedia text with a feature-based relation extraction method. Experimental results on a manually annotated test family network show that this method outperforms Bootstrapping with an F1-measure of 77%, suggesting that self-supervised learning can effectively extract personal family relationships.
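The first step, mapping known relation triples onto free text to obtain labeled examples, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `label_sentences`, the `(head, relation, tail)` triple layout, the plain substring matching of person names, and the `"none"` label for unlisted pairs are all hypothetical choices made for the example.

```python
from itertools import permutations


def label_sentences(sentences, triples, persons):
    """Turn known relation triples into labeled training examples.

    sentences: list of free-text sentences
    triples:   list of (head, relation, tail) tuples (assumed layout)
    persons:   list of known person names
    Returns a list of (head, tail, relation, sentence) tuples.
    """
    # Index known relations by ordered (head, tail) pair.
    known = {(head, tail): rel for head, rel, tail in triples}
    examples = []
    for sentence in sentences:
        # Naive mention detection: plain substring match (an assumption).
        mentioned = [p for p in persons if p in sentence]
        # Every ordered pair of co-occurring persons becomes one example;
        # pairs absent from the triples are labeled "none".
        for head, tail in permutations(mentioned, 2):
            relation = known.get((head, tail), "none")
            examples.append((head, tail, relation, sentence))
    return examples
```

For instance, with the single triple `("Alice", "spouse", "Bob")`, a sentence that mentions both names yields one `spouse` example for the pair (Alice, Bob) and one `none` example for the reverse direction. The labeled examples would then feed a feature-based classifier of the kind the abstract mentions.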

Chinese cross document co-reference resolution based on SVM classification and semantics
ZHAO Zhiwei, GU Jinghang, HU Yanan, QIAN Longhua, ZHOU Guodong
Journal of Computer Applications    2013, 33 (04): 984-987.   DOI: 10.3724/SP.J.1087.2013.00984
The task of Cross-Document Co-reference Resolution (CDCR) aims to merge mentions that are distributed across different texts and refer to the same entity into co-reference chains. Traditional research on CDCR uses clustering methods to address the name disambiguation problem posed in information retrieval. This paper recast CDCR as a classification problem, using a Support Vector Machine (SVM) classifier to resolve both name disambiguation and variant consolidation, both of which are prevalent in information extraction. This method can effectively integrate various features, such as morphological, phonetic, and semantic knowledge collected from the corpus and the Internet. The experiment on a Chinese cross-document co-reference corpus shows that the classification method outperforms clustering methods in both precision and recall.
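One way pairwise classification decisions could be turned into co-reference chains is sketched below. It is a minimal illustration under assumptions, not the paper's procedure: the abstract does not say how chains are assembled, so merging by transitive closure with union-find is an assumption, and `build_chains` and the `same_entity` callback (a stand-in for a trained SVM's yes/no decision on a mention pair) are hypothetical names.

```python
def build_chains(mentions, same_entity):
    """Group mentions into chains from pairwise same-entity decisions.

    mentions:    list of mention objects (any type)
    same_entity: callable(m1, m2) -> bool, a stand-in for a pairwise
                 classifier such as a trained SVM (assumption)
    Returns a list of chains, each a list of mentions.
    """
    parent = list(range(len(mentions)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair the classifier judges to be co-referent.
    for i in range(len(mentions)):
        for j in range(i + 1, len(mentions)):
            if same_entity(mentions[i], mentions[j]):
                parent[find(j)] = find(i)

    # Collect mentions by root to form the chains.
    chains = {}
    for i, mention in enumerate(mentions):
        chains.setdefault(find(i), []).append(mention)
    return list(chains.values())
```

Because the merge is transitive, a single wrong positive decision can join two otherwise separate chains, so a practical system would likely need a confidence threshold or a post-check on merged chains.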
Comparative analysis of impact of lexical semantic information on Chinese entity relation extraction
LIU Dan-dan, PENG Cheng, QIAN Long-hua, ZHOU Guo-dong
Journal of Computer Applications    2012, 32 (08): 2238-2244.   DOI: 10.3724/SP.J.1087.2012.02238
A method was proposed to incorporate semantic information from TongYiCi CiLin and HowNet into tree kernel-based Chinese relation extraction. The impact of these two kinds of semantic information on Chinese entity relation extraction was compared and analyzed, and the interrelation between lexical semantic information and entity type information was explored. The experimental results show that this method improves the performance of Chinese relation extraction to some degree, and that TongYiCi CiLin complements the entity type information to a certain extent. Therefore, whether or not entity type information is used, the semantic information from TongYiCi CiLin significantly improves extraction performance for most relation types. By contrast, some conflicts exist between HowNet and the entity type information, so HowNet improves performance for only a few relation types when entity types are provided.